Demographic, social, and behavioral correlates of SARS-CoV-2 seropositivity in a representative, population-based study of Minnesota residents

Background Monitoring COVID-19 infection risk in the general population is a public health priority. Few studies have measured seropositivity using representative, probability samples. The present study measured seropositivity in a representative population of Minnesota residents prior to vaccine availability and assessed the characteristics, behaviors, and beliefs of the population at the outset of the pandemic and their association with subsequent infection.
Methods Participants in the Minnesota COVID-19 Antibody Study (MCAS) were recruited from residents of Minnesota who participated in the COVID-19 Household Impact Survey (CIS), a population-based survey that collected data on physical health, mental health, and economic security between April 20 and June 8 of 2020. This was followed by collection of antibody test results between December 29, 2020 and February 26, 2021. Demographic, behavioral, and attitudinal exposures were assessed for association with the outcome of interest, SARS-CoV-2 seroprevalence, using univariate and multivariate logistic regression.
Results Of the 907 potential participants from the CIS, 585 respondents consented to participate in the antibody testing (64.4% consent rate). Of these, results from 537 test kits were included in the final analytic sample, and 51 participants (9.5%) were seropositive. The overall weighted seroprevalence was calculated to be 11.81% (95% CI, 7.30%-16.32%) at the time of test collection. In adjusted multivariate logistic regression models, the following significant associations with seropositivity were observed. Being in the 23–64 and 65+ age groups was associated with higher odds of COVID-19 seropositivity compared to the 18–22 age group (17.8 [1.2–260.1] and 24.7 [1.5–404.4], respectively). When compared to a less than $30k annual income reference group, all higher income groups had significantly lower odds of seropositivity. Reporting 10 (the sample median) or more of 19 potential COVID-19 mitigation behaviors (e.g. handwashing and mask wearing) was associated with lower odds of seropositivity (0.4 [0.1–0.99]). Finally, the presence of at least one household member aged 6 to 17 years was associated with higher odds of seropositivity (8.3 [1.2–57.0]).
Conclusions In adjusted models, SARS-CoV-2 seropositivity was significantly positively associated with older age and with having household member(s) in the 6–17 year age group, while higher income levels and a mitigation score at or above the median were significantly protective factors.


Introduction
In the first year of the COVID-19 pandemic, population-specific SARS-CoV-2 seroprevalence studies were conducted in a variety of groups, such as healthcare and emergency workers [1][2][3][4][5][6], office workers, children, and pregnant women [7][8][9], to understand the epidemiology of the disease. These data allowed for more complete case ascertainment than symptomatic diagnostic testing alone and provided important insights into the risk of infection for specific types of exposure.
To address these issues, the present study utilized a representative probability sample of Minnesota residents, collecting data by survey between April 20 and June 8 of 2020 on physical and mental health, economic security, and behavior changes in relation to COVID, followed by serologic testing between late December 2020 and late February 2021. These unique linked data (hereafter referred to as the Minnesota COVID-19 Antibody Study or MCAS) allowed us to measure seropositivity in a representative population prior to vaccines and to assess the characteristics, behaviors, and beliefs of the population and their association with subsequent infection.

Study design and population
Participants in MCAS were recruited by NORC from the residents of Minnesota who participated in the COVID-19 Household Impact Survey (CIS), a statistically valid, population-based survey that collected data on physical health, mental health, and economic security from the U.S. adult household population nationwide and for 18 regional areas including 10 states (CA, CO, FL, LA, MN, MO, MT, NY, OR, TX) and 8 Metropolitan Statistical Areas (Atlanta, Baltimore, Birmingham, Chicago, Cleveland, Columbus, Phoenix, Pittsburgh) during the early months of the pandemic [20,21]. Methods for selecting the sample and conducting the surveys are described in Swaziek, et al. and Wozniak [20].
In the original CIS, data were collected on basic demographic characteristics, household size and income, current and underlying physical health, mental health, economic security information, and personal behaviors around COVID-19. Data were collected on nineteen mitigation behaviors (including handwashing and mask-wearing but also a wide range of other practices that polls at the time were tracking such as whether or not participants had prayed, stockpiled food and water, wiped packages entering their homes, avoided high risk people, avoided some or all restaurants, and cancelled or postponed various activities).
For the purposes of the MCAS follow-up study, age in the CIS data was collapsed into 3 categories (18-22, 23-64, older than 64), race/ethnicity variables were collapsed into two categories (non-white and white), and education level was categorized as less than college degree, associates/bachelor's degree, and post-grad degree. Self-reported general health status was categorized as excellent/very good or good/fair/poor. COVID-related symptoms at the time of the initial survey and a history of any of the CDC-listed risk factors for severe COVID-19 at the time of the CIS (diabetes, high blood pressure or hypertension, heart disease, heart attack or stroke, asthma, chronic lung disease and COPD, bronchitis and emphysema, allergies, a mental health condition, cystic fibrosis, liver disease or end stage liver disease, cancer, a compromised immune system, and overweight or obesity) were each categorized as "yes" or "no." Questions in the CIS survey which screened for depression were combined to indicate the presence or absence ("yes" or "no") of any poor mental health days.
The CIS survey asked about 19 behaviors that people may engage in to reduce their risk of acquiring COVID-19. We incorporated these questions into our analysis in two ways. First, we dichotomized the total count of reported behaviors around the sample median of 10 reported behaviors. Second, we examined whether respondents engaged in the specific behaviors of masking and social distancing, which have since been shown to meaningfully reduce COVID risk [22,23]. We have termed this variable the "Big 2", and it is coded as reporting both masking and social distancing, reporting one of the two, or reporting neither. Finally, a variable was constructed to indicate the presence of children of various ages in the household compared to only adults.
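The two mitigation codings described above can be illustrated with a minimal sketch. The function names and the boolean inputs are hypothetical (they are not taken from the CIS codebook), and the cut point of 10 is the sample median reported in the text, with scores at or above the median treated as the higher group.

```python
def at_or_above_median(behavior_count: int, sample_median: int = 10) -> bool:
    """Dichotomize the count of reported mitigation behaviors (0-19).

    Returns True when the count is at or above the sample median.
    The default of 10 is the median reported in the text.
    """
    if not 0 <= behavior_count <= 19:
        raise ValueError("behavior_count must be between 0 and 19")
    return behavior_count >= sample_median


def big2_category(masked: bool, distanced: bool) -> str:
    """Code the "Big 2" variable as 'both', 'one', or 'neither'."""
    reported = int(masked) + int(distanced)
    return {2: "both", 1: "one", 0: "neither"}[reported]


# Hypothetical respondent: 12 behaviors reported, masking but no distancing.
print(at_or_above_median(12), big2_category(True, False))
```

Under these assumptions, a respondent reporting exactly 10 behaviors falls in the at/above-median group.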

Serosurvey
The Minnesota CIS sample surveyed 1,071 unique respondents, of whom 912 consented to be re-contacted and 907 provided complete contact information, including email addresses. This group served as the potential participants in the serosurvey, and NORC sent recruitment emails to these individuals with a web link and unique PIN offering participation in the antibody testing program. Respondents were pointed to a consent portal where they could sign up to receive a capillary blood collection kit mailed to their home, to self-collect capillary blood using Neoteryx Mitra® 10 μl samplers by volumetric absorption. Participants were offered a $25 incentive as well as their antibody test results for participation. Of the 907 potential participants, 585 respondents consented to participate in the antibody testing (64.4% consent rate). Of these, 581 were sent kits and 540 test kits were returned; results from 537 test kits with complete survey data were included in the final analytic sample. Specimens were then tested using the Quansys Q-Plex™ SARS-CoV-2 Human IgG (Quansys Biosciences, Logan, UT) [24].

Study variables
The primary outcome of interest in this study was SARS-CoV-2 seropositivity. The primary exposures of interest were age, sex, race/ethnicity (white/non-white), income, population density of place of residence (defined as rural, suburban, or urban), education level, household make-up, at/above median mitigation score, "Big 2" score, and poor mental health days (yes/no).

Statistical analysis
Prior to analysis of the CIS data, an iterative raking process was used to adjust for any survey nonresponse as well as any non-coverage or under- and oversampling. Raking variables included age, gender, race/ethnicity, education, and county groupings based on county-level counts of the number of COVID-19 deaths. Demographic weighting variables were obtained from the 2018 American Community Survey. The weighted data reflect the population of adults aged 18 and over in each region. The overall weighted seroprevalence was adjusted for testing error using the following formula [25], sensitivity/specificity estimates from Quansys3027, and the methodology described by Demmer, et al. [26]:

$$\text{adjusted prevalence} = \frac{\text{crude prevalence} + \text{specificity} - 1}{\text{sensitivity} + \text{specificity} - 1}$$

SAS version 9.4 was used for statistical analyses, including univariate descriptive statistics, univariate logistic regressions assessing the association between seropositivity and each factor of interest, as well as multivariate logistic models for each variable of interest adjusted for age, sex, income, population density, simplified education level, household make-up, at/above median mitigation score, and poor mental health days (yes/no) as reported by participants.
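As an illustration of the testing-error adjustment, the sketch below applies the correction (crude prevalence + specificity - 1) / (sensitivity + specificity - 1) to a single crude estimate. The function name is hypothetical. The sensitivity of 97% and specificity of 100% are the assay values quoted in the limitations, and the crude prevalence of 10% is an arbitrary illustrative input, not a study result. Clamping the result to [0, 1] is a common convention assumed here, not something the source specifies.

```python
def adjust_prevalence(crude: float, sensitivity: float, specificity: float) -> float:
    """Correct a crude seroprevalence for imperfect test accuracy.

    All arguments are proportions in [0, 1].
    """
    denominator = sensitivity + specificity - 1.0
    if denominator <= 0:
        raise ValueError("test must perform better than chance")
    adjusted = (crude + specificity - 1.0) / denominator
    # Clamp to the valid range (an assumed convention for edge cases).
    return min(1.0, max(0.0, adjusted))


# Illustrative input only: crude prevalence of 10%.
print(round(adjust_prevalence(0.10, sensitivity=0.97, specificity=1.00), 4))
```

With perfect specificity the correction reduces to dividing the crude prevalence by the sensitivity, so the adjusted value is slightly higher than the crude one.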
Informed consent was obtained electronically within the survey for participants in the CIS and electronically via an online web portal for MCAS participants. The study was approved by the University of Minnesota Institutional Review Board (#STUDY00011527).
Table 1 summarizes the characteristics of the study sample as well as unadjusted seroprevalence of the sample according to key variables of interest. The unadjusted seropositivity rate for the study population was 9.5% (51 out of the 537 returned test kits). The weighted and adjusted rate was 11.81% (95% CI, 7.30%-16.32%). Weighted seroprevalence varied by population density, ranging from 20.81% (95% CI, 4.23%-37.39%) in rural areas to 11.92% (95% CI, 6.24%-17.59%) in urban areas to 3.78% (95% CI, 0.57%-6.99%) in suburban areas. Weighted seroprevalence rates were observed to be at least five percentage points higher than the population rate for males (13.77%; 95% CI, 6.61%-20.92%), individuals with 1-2 children in the household (19.78%; 95% CI, 5.72%-33.83%), and those with school-age children between 6 and 17 years (21.41%; 95% CI, 7.65%-35.18%).

Results
In the multivariate analyses (Table 2), the following demographic variables were significantly related to a higher likelihood of seropositivity: 1) age 23-64 years (OR = 17.79; 95% CI, 1.22-260.08) compared to the 18-22 years group; 2) age 65 years or more (OR = 24.68; 95% CI, 1.51-404.44) compared to the 18-22 years group; and 3) having school-age children (aged 6-17 years) in the household (OR = 8.25; 95% CI, 1.20-56.98) compared to not having children in this age group. Factors that decreased the likelihood of seropositivity included earning between $30,000 and $60,000 per year (OR = 0.20; 95% CI, 0.05-0.91) and earning more than $125,000 per year (OR = 0.14; 95% CI, 0.02-0.76) compared to earning less than $30k. None of the potential risk factors (good/fair/poor health status, COVID-related symptoms at the time of the survey, presence of high-risk health conditions, or poor mental health) departed meaningfully from the overall statewide seroprevalence rate (Table 1) or attained statistical significance in the multivariate analyses (Table 2). Our examination of possible protective factors, namely engaging in personal public health-oriented behaviors such as postponing work-related activities or avoiding public or crowded places, showed that those who engaged in fewer than the median number of those behaviors had higher rates of seropositivity than what was observed at the statewide level (14.54%; 95% CI, 7.85%-21.23%; see Table 1). In the multivariate model (Table 2), engaging in the median number or more of protective/mitigation behaviors lowered the odds of being seropositive (OR = 0.36; 95% CI, 0.13-0.99).
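The very wide intervals above are easier to read on the log-odds scale. The sketch below assumes the reported intervals are Wald intervals, symmetric around the log odds ratio; the source does not state the interval method, so this is an assumption. Under it, the point estimate is the geometric mean of the two bounds and the standard error of the log odds ratio follows from the interval width. The function name is hypothetical.

```python
import math


def wald_summary(lower: float, upper: float, z: float = 1.959964):
    """Recover the odds ratio and SE(log OR) implied by a 95% Wald interval.

    Assumes the interval was built as exp(log_or +/- z * se).
    """
    log_lower, log_upper = math.log(lower), math.log(upper)
    odds_ratio = math.exp((log_lower + log_upper) / 2)  # geometric mean of bounds
    se_log_or = (log_upper - log_lower) / (2 * z)
    return odds_ratio, se_log_or


# Reported interval for age 23-64 vs. 18-22: 1.22-260.08 (OR = 17.79).
odds_ratio, se = wald_summary(1.22, 260.08)
print(round(odds_ratio, 2), round(se, 2))
```

The implied odds ratio for the 23-64 age group is close to the reported 17.79, and the large standard error on the log scale reflects how few seropositive respondents fall in the 18-22 reference group. The same check applied to the mitigation-score interval quoted in the Discussion (0.128-0.994) gives a value close to the reported 0.357.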

Discussion
MCAS links demographic and behavioral data from early in the COVID-19 pandemic to serostatus six months later. During this 6-month period, the State of Minnesota reached a peak reported seropositivity rate of 17.2% in November 2020 [27]. The overall MCAS seropositivity rate (11.81%) was lower than the estimated seroprevalence of 15.9% (95% CI, 13.3%-18.6%) observed for Minnesota in the CDC nationwide commercial laboratory survey in the first half of January 2021 [27]. Over the same period, Minnesota also experienced rises in hospitalizations and deaths that paralleled the experiences of many other parts of the US.
Three observations were robust. First, our multivariate regression found that older age groups were associated with higher odds of seropositivity and that living in a higher income household was associated with lower odds of seropositivity. A recent COVID-19 seroprevalence study conducted in the city of Belém, Brazil also found older age and lower income to be associated with seropositivity in the early waves of the pandemic [28]. These parallel findings suggest that age and income affect the risk of seropositivity even in distinct cultural and economic settings, and further demonstrate socio-economic disparities in COVID-19 risk [29,30].
Second, our data show that adults with school-aged children in their household had more than eight times the odds of seropositivity after adjusting for other variables. The significance and magnitude of the association between seropositivity and living with children of school age suggest that there are COVID-19 risk factors associated with the circumstances that accompany raising children. This is consistent with evidence that school-age children can become infected, transmit SARS-CoV-2, and contribute to family and community spread; however, our observational findings cannot determine whether this is the specific driving force of our findings, or whether some other feature associated with having school-aged children made respondents to the survey more vulnerable to the spread of COVID-19. Notably, the largest school districts in Minnesota were operating fully remotely for the period of our study. Further research should be conducted to determine if school-aged children are meaningful contributors to transmission, to inform whether broader testing efforts in schools might be a useful tool to identify infectious individuals, prevent outbreaks, and protect vulnerable members of the community.
The third robust finding relates to behavior intended to mitigate personal and collective risk. A mitigation score at or above the median (engaging in 10 or more of these behaviors) was associated with an adjusted odds ratio of 0.357 (95% CI, 0.128-0.994), indicating that those who took the pandemic more seriously early on, or changed more of their behaviors, were less likely to test positive for the presence of antibodies. While not all of these mitigation measures directly affected one's likelihood of infection, this metric seems to have captured a more general attitude.
Although this study has several strengths, including its deployment of state-of-the-art survey methods, its probability-based sampling approach, the temporal nature of the study design, the unprecedented inclusion of a wide array of social, behavioral, and attitudinal correlates of infection, and the use of a de-centralized capillary blood data collection protocol with high fidelity, it is important to note some potentially important limitations.
First, some racial and ethnic groups were underrepresented in the study relative to the nation as a whole, reflecting the population demographics of Minnesota. Second, the participants represent a group of individuals inclined to participate in studies such as this given their participation in the COVID-19 Household Impact Survey and consent to be re-contacted. These may be people generally inclined to engage in various other forms of prosocial behavior such as mask wearing and social distancing, suggesting the possibility of confounding by indication. Third, COVID-19 vaccines were being made available to healthcare workers and other high priority groups during the blood sample collection phase of the study. Respondents did not report whether or not they had been vaccinated when returning their samples, so it is not possible to control for or adjust our samples for potential vaccination. However, the impact of early vaccine access on our results is likely to be small, since access was highly restricted during this period and many individuals with vaccine access had only received one dose, which has been shown to be unlikely to generate a seropositive result [31]. The one exception is the population 65 and older, where access had risen such that a quarter of the population had completed the 2-dose vaccine series by the end of our collection period, up from 0.3% at the collection midpoint. We therefore interpret the positive association between older age and seropositivity in our results with some caution, as this is the one dimension in which vaccine access, in addition to infection, may have generated positivity.
Fourth, respondents did not report whether they had been formally diagnosed with COVID-19 between the baseline survey and bio-sample collection, so we cannot distinguish between previously detected and unreported cases. Fifth, the sample sizes involved often yielded wide confidence intervals at the more granular cuts of the data. Sixth, despite extensive questions on demographic and behavioral COVID-19 risk factors, some factors that could meaningfully affect seroprevalence, including substance use before and during the pandemic and occupation in high-risk fields (e.g. healthcare and service workers), were not asked about in CIS. Seventh, we treated all positive and negative test results as true results. In reality, the Human IgG (4-Plex) assay has a reported sensitivity of 97% and specificity of 100% [23]. This imperfect sensitivity, along with our long study window, may be biasing our seroprevalence estimates towards a lower value. Finally, we acknowledge that many of the variables, such as symptoms, mental health, and personal and public health mitigation behaviors, likely varied between the time of the initial survey and the subsequent blood specimen collection. We are in the process of fielding a follow-up study that deploys a design that supports a more contemporaneous assessment.
Table 2 note: Bolding indicates a statistically significant finding (p-value < 0.05) for the parameter estimate of the odds ratio/adjusted odds ratio of seropositivity. Multivariate logistic models were adjusted for the covariates listed under Statistical analysis.

Conclusion
Pairing data on pandemic attitudes and behaviors with serologic results provides the most complete insight into transmission risk factors to inform further epidemiologic study. Important risk factors such as school-age children in the household and protective factors such as personal mitigation behaviors suggest that public health planners should focus on these issues as they deal with the current outbreak and those that may emerge in the future.
Supporting information S1 File.